 amino acid residue


Extracting Inter-Protein Interactions Via Multitasking Graph Structure Learning

Li, Jiang, Li, Yuan-Ting

arXiv.org Artificial Intelligence

Identifying protein-protein interactions (PPI) is crucial for gaining in-depth insights into numerous biological processes within cells and holds significant guiding value in areas such as drug development and disease treatment. Currently, most PPI prediction methods focus primarily on the study of protein sequences, neglecting the critical role of the internal structure of proteins. This paper proposes a novel PPI prediction method named MgslaPPI, which utilizes graph attention to mine protein structural information and enhances the expressive power of the protein encoder through a multitask learning strategy. Specifically, we decompose the end-to-end PPI prediction process into two stages: amino acid residue reconstruction (A2RR) and protein interaction prediction (PIP). In the A2RR stage, we employ a graph attention-based residue reconstruction method to explore the internal relationships and features of proteins. In the PIP stage, in addition to the basic interaction prediction task, we introduce two auxiliary tasks, i.e., protein feature reconstruction (PFR) and masked interaction prediction (MIP). The PFR task aims to reconstruct the representation of proteins in the PIP stage, while the MIP task uses partially masked protein features for PPI prediction; the two tasks work in concert to prompt MgslaPPI to capture more useful information. Experimental results demonstrate that MgslaPPI significantly outperforms existing state-of-the-art methods under various data partitioning schemes.


Boosting Protein Language Models with Negative Sample Mining

Xu, Yaoyao, Zhao, Xinjian, Song, Xiaozhuang, Wang, Benyou, Yu, Tianshu

arXiv.org Artificial Intelligence

We introduce a pioneering methodology for boosting large language models in the domain of protein representation learning. Our primary contribution lies in the refinement process for correlating the over-reliance on co-evolution knowledge, in a way that networks are trained to distill invaluable insights from negative samples, constituted by protein pairs sourced from disparate categories. By capitalizing on this novel approach, our technique steers the training of transformer-based models within the attention score space. This advanced strategy not only amplifies performance but also reflects the nuanced biological behaviors exhibited by proteins, offering aligned evidence with traditional biological mechanisms such as protein-protein interaction. We experimentally observed improved performance on various tasks over datasets, on top of several well-established large protein models. This innovative paradigm opens up promising horizons for further progress in the realms of protein research and computational biology.


AI driven B-cell Immunotherapy Design

da Silva, Bruna Moreira, Ascher, David B., Geard, Nicholas, Pires, Douglas E. V.

arXiv.org Artificial Intelligence

Bruna Moreira da Silva is a PhD student at The University of Melbourne. Her research interests are in bioinformatics, immunoinformatics and machine learning to advance Global Health. David B. Ascher is the Director of Biotechnology at the University of Queensland and head of Computational Biology and Clinical Informatics at the Baker Institute and Systems and Computational Biology at Bio21 Institute. He is interested in developing and applying computational tools to assist in leveraging clinical and omics data for drug discovery and personalised medicine. Nicholas Geard is an Associate Professor at the School of Computing and Information Systems at the University of Melbourne and Director of the Melbourne Data Analytics Platform. He is a computer scientist specialising in computational simulation applied to a range of problems in health and epidemiology. Douglas E. V. Pires is an Associate Professor in Digital Health at the School of Computing and Information Systems at the University of Melbourne and group leader at the Bio21 Institute. He is a computer scientist and bioinformatician specialising in machine learning and AI and the development of the next generation of tools to analyse omics data and guide drug discovery and personalised medicine.

ABSTRACT

Antibodies, a prominent class of approved biologics, play a crucial role in detecting foreign antigens. The effectiveness of antigen neutralisation and elimination hinges upon the strength, sensitivity, and specificity of the paratope-epitope interaction, which demands resource-intensive experimental techniques for characterisation. In recent years, artificial intelligence and machine learning methods have made significant strides, revolutionising the prediction of protein structures and their complexes. The past decade has also witnessed the evolution of computational approaches aiming to support immunotherapy design. This review focuses on the progress of machine learning-based tools and their frameworks in the domain of B-cell immunotherapy design, encompassing linear and conformational epitope prediction, paratope prediction, and antibody design. We mapped the most commonly used data sources, evaluation metrics, and method availability and thoroughly assessed their significance and limitations, discussing the main challenges ahead.

INTRODUCTION

Therapeutic antibodies are a rapidly growing class of biopharmaceuticals with potentially exceptional antigen specificity and affinity. Their ability to detect and eliminate a wide array of foreign threats makes them suitable for a range of potential therapeutic and diagnostic applications. Antibody and antigen engineering have benefited greatly from the evolution of research in computational biology, leading to innovative approaches in screening antibody targets, optimising their biochemical and physical properties, predicting and optimising binding affinity and understanding escape mutations [1].


Independent SE(3)-Equivariant Models for End-to-End Rigid Protein Docking

Ganea, Octavian-Eugen, Huang, Xinyuan, Bunne, Charlotte, Bian, Yatao, Barzilay, Regina, Jaakkola, Tommi, Krause, Andreas

arXiv.org Artificial Intelligence

Protein complex formation is a central problem in biology, being involved in most of the cell's processes, and essential for applications, e.g. [...] We tackle rigid body protein-protein docking, i.e., computationally predicting the 3D structure of a protein-protein complex from the individual unbound structures, assuming no conformational change within the proteins happens during binding. We design a novel pairwise-independent SE(3)-equivariant graph matching network to predict the rotation and translation to place one of the proteins at the right docked position relative to the second protein. We mathematically guarantee a basic principle: the predicted complex is always identical regardless of the initial locations and orientations of the two structures. Empirically, we achieve significant running time improvements and often outperform existing docking software despite not relying on heavy candidate sampling, structure refinement, or templates.

Figure 1: Different views of the 3D structure of a protein complex.

Besides their complex three-dimensional nature, proteins dynamically alter their function and structure. In particular, protein interactions are involved in various biological processes including signal transduction, protein synthesis, DNA replication and repair. Molecular docking is key to understanding protein interactions' mechanisms and effects, and, subsequently, to developing therapeutic interventions.
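The rigid-body assumption in the abstract above can be illustrated with a minimal sketch. It is not the authors' model: it only shows that placing a structure with a rotation and a translation leaves all of its internal distances unchanged, which is what "no conformational change" means geometrically. The coordinates, the rotation angle, the translation vector, and the helper names (`rotation_z`, `apply_rigid`, `pairwise_distances`) are hypothetical and chosen only for illustration.

```python
import math

def rotation_z(theta):
    """Return the 3x3 matrix for a rotation of theta radians about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_rigid(points, rotation, translation):
    """Apply the rigid transform x -> R x + t to every 3D point."""
    moved = []
    for p in points:
        rotated = [sum(rotation[i][j] * p[j] for j in range(3)) for i in range(3)]
        moved.append([rotated[i] + translation[i] for i in range(3)])
    return moved

def pairwise_distances(points):
    """Return the distances between all unordered pairs of points."""
    return [math.dist(a, b) for i, a in enumerate(points) for b in points[i + 1:]]

# Hypothetical coordinates standing in for four atoms of one protein.
ligand = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.2, 0.3], [0.2, 2.0, -1.0]]

# Hypothetical "predicted" rotation and translation.
docked = apply_rigid(ligand, rotation_z(math.pi / 3), [10.0, -4.0, 2.5])

before = pairwise_distances(ligand)
after = pairwise_distances(docked)

# A rigid transform moves the structure but preserves its internal geometry.
assert all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(before, after))
```

A docking method of the kind described only has to output the rotation and the translation; the internal geometry of each protein is taken as given.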


AlphaFold advances protein folding research

AIHub

The grand challenge of protein folding hit the news this week when it was announced that the latest version of DeepMind's AlphaFold system had predicted protein structures with very high accuracy in CASP's 2020 experiment. Proteins are large, complex molecules, and the shape of a particular protein is closely linked to the function it performs. The ability to accurately predict protein structures would enable scientists to gain a greater understanding of how they work and what they do. This new version of AlphaFold builds on the initial system, which you can read about in this paper. The associated code is available here.


DeepMind's improved protein-folding prediction AI could accelerate drug discovery

#artificialintelligence

It's these genetic definitions that circumscribe their three-dimensional structures, which in turn determine their capabilities. But protein "folding," as it's called, is notoriously difficult to figure out from a corresponding genetic sequence alone. DNA contains only information about chains of amino acid residues and not those chains' final form. In December 2018, DeepMind attempted to tackle the challenge of protein folding with a machine learning system called AlphaFold. The Alphabet subsidiary said at the time that AlphaFold, the product of two years of work, could predict structures more precisely than prior solutions.


Combining Machine Learning and Optimization Techniques to Determine 3-D Structures of Polypeptides

Dorn, Marcio (Federal University of Rio Grande do Sul) | Buriol, Luciana Salete (Federal University of Rio Grande do Sul) | Lamb, Luis da Cunha (Federal University of Rio Grande do Sul)

AAAI Conferences

One of the main research problems in Structural Bioinformatics is the analysis and prediction of three-dimensional (3-D) structures of polypeptides or proteins. The genome projects of the 1990s resulted in a large increase in the number of protein sequences. However, the number of identified 3-D protein structures has not followed the same trend. The determination of protein structure is experimentally expensive and time consuming. This makes scientists largely dependent on computational methods that can predict correct 3-D protein structures only from extended and full amino acid sequences. Several computational methodologies and algorithms have been proposed as a solution to the Protein Structure Prediction (PSP) problem. We briefly describe the AI techniques we have used to tackle this problem.